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ABSTRACT 

Overcomplete transforms, like the Dual-Tree Complex 
Wavelet Transform, offer more flexible signal representa- 
tions than critically-sampled transforms, due to their proper- 
ties of shift invariance and directional selectivity. We show 
that many transform coefficients can be discarded with- 
out much reconstruction quality loss by forcing compen- 
satory changes in the remaining coefficients. We consider 
the convergence properties of an iterative projection system 
for achieving the usual coding aims of good sparsity with 
low reconstruction error. Results show how these measures 
translate to useful image compression performance. 

1. INTRODUCTION 

We have previously developed the Dual-Tree Complex 
Wavelet Transform (DT CWT) as a useful shift-invariant 
and directionally selective image analysis tool [1]. Here we 
consider how these properties may be harnessed for image 
compression, despite the DT CWT's 4: 1 redundancy (over- 
completeness). Matching Pursuits [2] is a well known tech- 
nique for coding with overcomplete dictionaries, but it tends 
to be very computationally demanding. In this paper we dis- 
cuss iterative projection techniques, introduced in [3], in or- 
der to achieve efficient coding with potentially lower com- 
putational costs. Bolcskei and Hlawatsch [4] have previ- 
ously examined the effect of additive noise in oversampled 
filter banks systems, and Fischer and Cristobal [5] have pro- 
posed an iterative loop similar to ours. 

When an overcomplete transform is employed, the in- 
verse transform involves a projection from the higher di- 
mensional transform space to the lower dimensional image 
space; e.g. from 4Ar-space to TV-space in the case of the 4: 1 
overcomplete DT CWT on an AT-pixel image. Hence within 
the 4Ar-space there is the TV -dimensional range space and 
an orthogonal 3iV-dimensional null space of the transform. 
Movement within the range space produces changes in the 
output image whereas movement within the null space pro- 
duces no change; and so different configurations of wavelet 
coefficients with the same range space component, but with 
different components in null space, can produce the same 
decoded output image. 




Fig. 1. Iterative-projection system block diagram, i is the 
iteration number. 



For image coding piuposes, we would like to find the 
configuration that concentrates image energy in as few non- 
zero wavelet coefficients as possible, while still producing a 
good approximation to the original image. The scheme pre- 
sented here attempts iteratively to modify large coefficients 
to compensate as far as possible for the loss of small coef- 
ficients, by minimizing the error between the output image 
and the original. 

2. TRANSFORM PROJECTIONS 

Figure 1 shows the block diagram of our iterative algorithm. 
Let X be an iV-vector representing the original iV-pixel real- 
valued image in ID- vectorized form, and let A be the My.N 
DT CWT analysis operator matrix, where M = AN and the 
rows of A are piu-ely real and alternately represent the real 
and imaginary parts of the transform bases. Then y = Ax, 
where y is the M-vector of real and imaginary parts of the 
DT CWT coefficients. The real N x M synthesis or recon- 
struction operator matrix is R, so that x = Ry. It is then 
easy to show that for a perfect reconstruction transform, R 
is given by: 

R = R -h U[I - AR] where R = [A^A]"^ A^ (1) 

Hence RA = RA = I , Note that U is an arbitrary con- 
stant matrix defining a family of possible R's, A^ is the 
transpose of A, and R is the pseudo-inverse of A. We shall 
assume that the energy of R is minimized, so that U = 0 
and R = R, which is closely approximated in the DT CWT. 
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Let 5 be the space of all transfonn domain signals y, 
obtainable from finite real input signals x; i.e. S is the range 
space of A. The projection operator onto 5 is '= AR. If 
S-^ is the orthogonal complement space of 5, the projection 
operator onto iS-"- isP-'- = I~ARsothaty = P*^y+P-'-y 
for any y. From equ. (1) we see that RP*^ = .RAR = R 
and RP-^ = R - RAR = 0. Hence P'^y is the range- 
space component of y which defines the output image x = 
Ry, while P-'-y is the null-space component that has no 
effect on X. 

The non-linear function f{y,0) in fig. 1 is designed 
to suppress small ampHtude coefficients, and optionally to 
quantize those of larger amplitude. The simplest useful 
function is a hard threshold centre-clipper, operating on 
the magnitudes of the complex coefficients to eliminate in- 
significant coefficients and leave the significant coefficients 
unaffected. It produces y = f{y,0), where the two com- 
ponents of the k^^ complex element of y , for A: = 1 . . . 2iV, 
are given by: 

i S/2ik-i-Hj2/2ifc Otherwise ^ 

When i = 0, yi is initiahzed to yo, the transform of x. 
The centre-clipper then sets all small coefficients of yi to be 
zero in y^. The error image is the error between x and 
Xi , the inverse transform of the sparse vector y ^ . 

Assuming k — 1 for the moment, is the transform 
of e^, and each wavelet coefficient in defines the amount 
of each wavelet basis function present in the error image. 
These components tend to modify the non-zero coefficients 
of yo such that they are increased in amplitude while the 
coefficients of yi, which were set to zero in yo, tend to be 
reduced compared with their amplitudes in yo. In this way, 
the error ei after one iteration of the loop is significantly re- 
duced compared with the mitial error eo- Further iterations 
continue to reduce the error until convergence occurs. 

For a more rigorous analysis, we spht y^+i into its 
range-space and null-space components: 

Yi+i = Yi + Wi = yt + feA(x - Ry^) (3) 
= yz + A;yo - kP^Yi = kyo + (I - kP^)yi 



= yo + P-^yt ifA:=.l 



(4) 



Hence, if k = 1, each new y equals the original yo plus 
any null-space components of the previous y. Since the null 
space is orthogonal to the range space, the range space com- 
ponent P'^yi = P^^yo = yo for all i; and so the inverse 
transform of every y^ will be x (i.e. every y^ is a valid 
representation of x in the transform domain). 

3. CONVERGENCE ANALYSIS 

We may analyze conveigence using the theory of Projection 
onto Convex Sets (POCS). Following Yang [6], a set is con- 
vex if any linear interpolation between any two members of 



the set is also in the set. The projection Ptf onto a set Ci 
from an arbitrary vector f finds the member of Ci that is 
closest to f . 

If the nonUnear function in fig. I is a centre-cUpper, as 
defined in equ. (2), we may separate the clipping action into 
two steps: the first step selects which coefficients of y^ are 
to be retained, by generating a mask vector jxii of zeros 
and ones; the second step multiplies y^ by mi (element-by- 
element) to give yi. Given the maisk nii, this second step is 
a projection Pi from y^ onto the convex set Ci of all vectors 
whose non-zero elements are those selected by ones in the 
mask. Hence y^ = Piy^ . 

From equ. (4), the remaining parts of the loop in fig. 1 
may be shown, if fc = 1, to be a projection P2 from y^ 
to yi+i = P2yi, where the convex set C2 is now the set 
of all vectors whose range-space component is yo = Ax. 
Hence our loop comprises repeated projections between Ci 
and (72, given by yi+i = P2Piyi. 

If the choice of mask rrii were fixed for all iterations, 
then sets Ci and C2. would be constant and, by the dieory of 
POCS, the loop would converge either to a point where the 
two sets overlap or to the closest pair of points in the two 
sets if they do not overlap (the more likely case for lossy 
compression). But nii is not constant, so we invoke the 
following argument. 

If we first choose mo based on picking the largest Mnz 
complex coefficients from yo as the non-zeros, and then 
iterate the POCS loop for L iterations with mi = mo 
(i.e. with constant Ci) until y^ approximately converges, we 
find the optimum y^ given that mj = mo. At convergence, 
yi+i yi and so the transform domain loop error is 



yi - yi ^ yi+i - y» = Wi = Aei ' 



(5) 



which is entkely within S. Now we can pick a better 
mi = m^, where m/, is based on selecting the largest 
Mnz coefficients from y^ as the non-zeros (as in a centre- 
clipper). This modifies Ci so that the error ||yi - yi|| be- 
comes smaller. This will result in a smaller image domain 
error II ei 1 1 for two reasons: (a) because ~ R(yo - Yi) = 
P'(yi - yi); and (b) because the new yi - y^ will probably 
no longer be in S and so ei, its projection into the image 
space, will be smaller still. Hence modifying the mask at 
regular intervals to simulate a centre-clipper, can produce 
further reductions in loop error, in addition to the reductions 
produced by POCS with a fixed mask. 

From here we can argue that updating the mask on ev- 
ery iteration will produce more rapid convergence than up- 
dating it less frequently after waiting for the POCS to con- 
verge each time. This converts our two-step cUpper back 
into a regular centre-clipper, and shows that the loop with 
centre-cHp non-linearity will converge to a point which lo- 
cally minimizes the image domain error ||ei||. Our experi- 
ments support the validity of this argument. 
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Fig. 2. Convergence of the PSNR (dB) vs number of non- 
zero complex coefficients Mnz for various increasing, con- 
stant and decreasing trajectories of Mnz with i, over 30 
iterations in each case. The dashed line shows the rate- 
distortion curve for the non-redundant DWT, for compari- 
son. (Note: the DWT coefficients are real, not complex.) 

4. CONVERGENCE EXPERIMENTS 

We now turn to ways of encouraging the local minimum 
to be a good one. We consider the effect of adapting the 
sparsity of yi with iteration i. 

First, note that there is hysteresis in the system. Suppose 
the clipping threshold is gradually being reduced. When a 
coefficient first exceeds the threshold, it suddenly appears 
in yi. Subsequent iterations of the loop tend to increase the 
amplitude of this coefficient significantly above the thresh- 
old. We would then have to raise the clipping threshold to 
perhaps twice its original value before that coefficient would 
disappear from y^ again. Hence there is hysteresis and the 
order in which coefficients are selected or removed during 
convergence becomes important. 

Figure 2 illustrates a test of this by plotting PSNR (= 
10 logio(255^Ar/||ei|p) against number of non-zero coeffi- 
cients Mnz' We use the 512 x 512 Lena image and 5 levels 
of wavelet decomposition. The six solid curves with crosses 
indicate how the PSNR varies with Mm, starting from six 
different initial values of M^z and all converging on a fi- 
nal desired M^ = 12000 . The initial values are 2400, 
4000, 8000, 12000, 18000 and 36000; and in each case 
Mnz converges to 12000 over 26 iterations in a geomet- 
ric series. Four further iterations are then performed with 
Mnz = 12000 to allow final conver:gence. We see that the 
curve with constant M„^ has the worst performance, and 
that it is better to increment Mnz from quite a low initial 
value than to decrement it from a high value. 

Taking the best result (the left-hand curve), it is clear 
that considerable performance gains are possible compared 
with a non-iterated system, whose performance is shown 
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Fig. 3. PSNR (dB) vs. Mm for the best iterative DT CWT 
scheme (upper curve), the non-iterated DWT (dashed curve) 
and the non-iterated DT CWT (lower curve). 

by the initial points on each of the six curves. For exam- 
ple we see that the best converged result achieves almost 
the same PSNR with 12000 coefficients (38.79 dB) that 
the non-iterated DT CWT achieves with 36000 coefficients 
(38.77 dB). Looking vertically, we also see that the best con- 
verged result is 4.66 dB better than the non-iterated result of 
34.11 dB. 

There are several ways to obtain small improvements on 
the basic POCS loop, described above. Two of these are: 

(a) There is loss around the loop due to both projection 
processes. We can compensate for this by* increasing 
k. From equ. (4), range-space components, of y^ have 
a gain of (1 — /c). For stability (1 — A;) must lie inside 
the unit circle, so 0 < /: < 2. In practice, we use 
A:= 1.8. 

i 

(b) The hysteresis of the system is caused by the very 
high gain of the centre-clipper characteristic around 
its threshold. To reduce this problem we replace the 
clipper with a similar non-linear function that has lim- 
ited maximum slope, such as the Wiener denoising 
function: 

r 0 if\y\<9 

y=<- l^'V-^^ otherwise 



This has zero gain below threshold and closely ap- 
proximates unit gain when |y| » 6, but it is con- 
tinuous and has a maximum gradient of 2. If this 
function replaces the centre-clipper for the early it- 
erations, then better patterns of non-zero coefficients 
are indeed produced and there is less need for the it- 
erations to start with a very small value of Mnz- 
These improvements contribute a small gain in final PSNR 
(typically 0.3 to 0.9 dB) and some improvement in rate of 
convergence. Figure 3 compares the performance of the best 



iterated DT CWT scheme (for A; = 1.8) over a range of 
Mnz values, with that of the DWT and DT CWT (both non- 
iterated). The iterated curve was produced by first using 
15 iterations of Wiener non-linearity and then 15 iterations 
of centre-clipping to converge to a good starting point with 
Mnz = 2400. Then we incremented Mni by about 2% on 
each of 100 further iterations to produce the 100 points in 
fig. 3. This figure seems to show a dramatic superiority for 
the iterated DT CWT scheme, but ihc complex DT CWT 
coefficients will need more bits to code each of them than 
the real DWT coefficients. However this will be much less 
than twice as many bits, because in a sparse data set the 
location of each non-zero coefficient often requires more 
bits to code it than the magnitude and sign (or phase) do. 

5. CODING RESULTS 

We now consider fully quantized systems (not just centre- 
clipped ones) arid estimate the bit rate based on simple en- 
tropy measurements of the quantized data at each scale. 
Proper coding methods such as those lised in SPIHT and 
JPEG2000 will give small improvements oyer simple en- 
tropy, but the re/anve perfonnances of the energy compres- 
sion and quantization processes, which are the main topic of 
this paper, should be little altered by this! For all the' results 
in this paper we used the standard Daubechies (9,7)-tap fil- 
ters in die DWT and the following filters in the DT CWT 
[1]: (13,19)-tap near-orthogonal fi.lters at level 1, 14-tap Q- 
shift filters below level 1 . 

For these tests, we took the upper curve of fig. 3 and 
at every third point introduced a 2-D circularly synmietric 
complex quantizer into the loop in place of the centre clip- 
per. The quantizer has Voronbi regions, made up of con- 
centric rings of equal width Ai2 around a circle of diam- 
eter 2 Ai?, centred on the origin. The central circle forms 
the *zero' bin of the quantizer and is undivided, while each 
ring is divided into approximately 'square' sectors for cod- 
ing the phase. There are 8 sectors in ring 1,12 in ring 2, 16 
in ring 3 etc. (i.e. 4(A: + 1) sectors in the ring of inner radius 
A; Ai2). The real coefficients of the DWT are coded using 
the equivalent 1-D quantizer in which the central *zero' bin 
is of width 2 Ait! and the other bins are all of width Ai?. 

TTie first-order entropies of the quantized samples were 
measured at each wavelet scale, and from these the number 
of bits to code the Lena image were calculated over a range 
of step sizes. For the iterated DT CWT the average number 
of bits to code each non-zero complex coefficient is 13.5 
when the PSNR is around 36 dB. For the DWT the average 
number of bits to code each non-zero real coefficient is 7.4 
at the same PSNR. Figure 4 shows that, over the range 32 to 
38 dB, theDT CWT scheme reduces the entropy by a factor 
of 0.86 (14%) at around 34 dB PSNR; This is equivalent to 
an improvement of 0.65 dB in PSNR for a given entropy. 

Subjective comparisons of image quality indicate that 
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Fig. 4. PSNR (dB) vs. entropy (bit/pel) for 512 x 512 Lena, 
comparing coding of the complex coefficients of the iterated 
DT CWT (solid curve with crosses) with coding of the real 
coefficients of the non-iterated DWT (dashed curve). 

coding of diagonal edges (such as those on Lena's hat) can 
be significantly improved with the DT CWT approach. The 
good directional selectivity of the DT CWT is clearly ben- 
eficial here. Further .work is required to develop good cod- 
ing schemes for the DT CWT (e.g. derivatives of SPIHT) 
which take full advantage of the spatial correlations in cod- 
ing the locations of the non-zero coefficients. At present the 
number of iterations (typically about 30) to achieve good 
solutions is rather too high to make the proposed method at- 
tractive in real-time coding appUcations of reasonably large 
images, although the decoding process is non-iterative and 
very efficient. ' ' 

6. REFERENCES 

[11 N G Kingsbury, "Complex wavelets for shift invariant analy- 
sis and fi Itering of signals,** Applied and Computational Har- 
monic Analysis, vol. 10, no. 3, pp. 234^253, May 2001. 

[2] S G Mallat and Z Zhang, "Matching pursuits with time- 
frequency dictionaries," IEEE Trans, Signal Pwc, vol. 41, 
' pp. 3397-3415, Dec. 1993. 

[3] T H Reeves and N G Kingsbury, ''Overcomplete image coding 
using iterative projection-based noise shaping,** in ICIP 02, 
Rochester, Sept 2002, paper 2492. 

[41 H Bolcskei and F Hlawatsch, "Ovefsampled filter banks: Op- 
timal noise shaping, design freedom, and noise analysis,*' in 
ICASSP 97, April 1997. pp. 2453-2456. 

[5] S Fischer and G Cristobal, "Minimum entropy transform us- 
ing gabor wavelets for image compression,'* in ICIAP'O], 
Palermo, Sept 2001. 

[61 Y Yang and N P Galatsanos, '^Removal of compression arti- 
facts using projections onto convex sets and line process mod- 
elling,** IEEE Trans. Image Proc, pp. 1345-1357, Oct. 1997. 



